Disease gene classification with metagraph representations.
Abstract
Protein-protein interaction (PPI) networks play an important role in studying the functional roles of proteins, including their association with diseases. However, PPI networks alone are not sufficient; they need the support of additional biological knowledge about proteins, such as their molecular functions and biological processes. To complement and enrich PPI networks, we propose to exploit the biological properties of individual proteins. More specifically, we integrate keywords describing protein properties into the PPI network and construct a novel PPI-Keywords (PPIK) network with two types of nodes: proteins and keywords. As disease proteins tend to have similar topological characteristics on the PPIK network, we further propose to represent proteins with metagraphs. Unlike a traditional network motif or subgraph, a metagraph can capture a particular topological arrangement involving the interactions and associations between both proteins and keywords. Based on these metagraph representations, we build classifiers for disease protein classification through supervised learning. Our experiments on three different PPI databases demonstrate that the proposed method consistently improves disease protein prediction across various classifiers, by 15.3% in AUC on average. It outperforms baselines including diffusion-based methods (e.g., RWR) and module-based methods by 13.8-32.9% for overall disease protein prediction. For predicting breast cancer genes, it outperforms RWR, PRINCE and the module-based baselines by 6.6-14.2%. Finally, our predictions also correlate better with literature findings from PubMed.
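The idea of a two-node-type PPIK network and pattern-based protein features can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names `build_ppik` and `metagraph_features`, the protein IDs, the keywords, and the two counted patterns are hypothetical, and the abstract does not specify which metagraphs the authors actually use. The sketch only shows how counts of protein-keyword patterns could become a feature vector for a supervised classifier.

```python
from collections import defaultdict


def build_ppik(ppi_edges, protein_keywords):
    """Build a toy PPIK network as two adjacency maps.

    ppi_edges: iterable of (protein, protein) interaction pairs.
    protein_keywords: dict mapping protein -> iterable of keywords.
    """
    neighbors = defaultdict(set)  # protein -> interacting proteins
    keywords = defaultdict(set)   # protein -> annotated keywords
    for a, b in ppi_edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    for protein, kws in protein_keywords.items():
        keywords[protein].update(kws)
    return neighbors, keywords


def metagraph_features(protein, neighbors, keywords):
    """Count instances of two illustrative patterns anchored at `protein`.

    m1: protein - protein (a plain interaction edge)
    m2: protein - protein edge whose endpoints share a keyword
        (a protein/protein/keyword triangle), one count per shared keyword
    """
    partners = neighbors.get(protein, set())
    own = keywords.get(protein, set())
    m1 = len(partners)
    m2 = sum(len(own & keywords.get(q, set())) for q in partners)
    return [m1, m2]


# Hypothetical toy data, not taken from any PPI database.
ppi_edges = [("P1", "P2"), ("P2", "P3"), ("P1", "P3"), ("P3", "P4")]
protein_keywords = {
    "P1": {"kinase", "apoptosis"},
    "P2": {"kinase"},
    "P3": {"kinase", "apoptosis"},
    "P4": {"transport"},
}

neighbors, keywords = build_ppik(ppi_edges, protein_keywords)
features = {
    p: metagraph_features(p, neighbors, keywords)
    for p in sorted(protein_keywords)
}
```

Under these assumptions, each row of `features` could serve as the input vector for a standard classifier trained on known disease and non-disease proteins; a realistic feature set would cover many more patterns than the two counted here.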
Related articles
Fuzzy Metagraph Based Knowledge Representation of Decision Support System
This paper proposes a Fuzzy Metagraph based knowledge representation for a Decision Support System (DSS). The system helps users make correct decisions with very low risk. Fuzzy Metagraph is an emerging technique widely used in real-world applications. It is used to form the rule base that supports the inference system in making correct decisions. This method can be used in many real wo...
Fuzzy Metagraph and Vague Metagraph based Techniques and their Applications
Metagraphs are graphical hierarchical structures in which every node is a set having one or more elements. Fuzzy Metagraph and Vague Metagraph are emerging techniques used in the design of many information processing systems, such as transaction processing systems, Decision Support Systems (DSS), and workflow systems. In this paper, distinct matrices have been proposed for Fuzzy Metagraph and Vagu...
Fuzzy Meta Node Fuzzy Metagraph and its Cluster Analysis
Problem statement: In this study, the researchers propose a new fuzzy graph-theoretic construct called the fuzzy metagraph and a new clustering method for finding similar fuzzy nodes in a fuzzy metagraph. Approach: We adopted T-norm (Triangular Norm) functions and joined two or more T-norms to cluster the fuzzy nodes. Fuzzy metagraph is the fuzzification of the crisp Metagraphs using fuzzy Generating ...
Feature Selection and Classification of Microarray Gene Expression Data of Ovarian Carcinoma Patients using Weighted Voting Support Vector Machine
DNA microarray gene expression gives access to a wealth of information spanning thousands of variables (genes). Analysis of this information can reveal the genetic causes of disease and of differences between tumors. In this study we try to reduce high-dimensional data by a statistical method to select valuable genes with high impact as biomarkers and then classify ovarian tumor based on gene expression data of...
Classification and properties of acyclic discrete phase-type distributions based on geometric and shifted geometric distributions
Acyclic phase-type distributions form a versatile model, serving as approximations to many probability distributions in various circumstances. They exhibit special properties and characteristics that usually make their applications attractive. Compared to acyclic continuous phase-type (ACPH) distributions, acyclic discrete phase-type (ADPH) distributions and their subclasses (ADPH family) have ...
Journal: Methods
Volume: 131
Publication date: 2017